- Semivariogram
- Kriging
- Neighborhood definitions
- Moran's I
We're today usinf the meuse dataset, which is provided by sp package. It comprises of measurements of four heavy metalsin the top soil in a flood plain along the river Meuse, along with a handful of covariates.
data("meuse")
head(meuse)
## x y cadmium copper lead zinc elev dist om ffreq soil ## 1 181072 333611 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 ## 2 181025 333558 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 ## 3 181165 333537 6.5 68 199 640 7.800 0.10302900 13.0 1 1 ## 4 181298 333484 2.6 81 116 257 7.655 0.19009400 8.0 1 2 ## 5 181307 333330 2.8 48 117 269 7.480 0.27709000 8.7 1 2 ## 6 181390 333260 3.0 61 137 281 7.791 0.36406700 7.8 1 2 ## lime landuse dist.m ## 1 1 Ah 50 ## 2 1 Ah 30 ## 3 1 Ah 150 ## 4 0 Ga 270 ## 5 0 Ah 380 ## 6 0 Ga 470
Converting it into a spatial points object:
library(sp)
coordinates(meuse) <- ~x+y
proj4string(meuse) <- CRS("+init=epsg:28992")
mapview(meuse)
Semivariograms show the semivariance (variance of a variable when compared at distant locations). We can estimate an empirical semivarigrams using the gstat package:
library(gstat) g <- gstat(formula = log(zinc) ~ 1, data = meuse) var <- variogram(g) head(var)
## np dist gamma dir.hor dir.ver id ## 1 57 79.29244 0.1234479 0 0 var1 ## 2 299 163.97367 0.2162185 0 0 var1 ## 3 419 267.36483 0.3027859 0 0 var1 ## 4 457 372.73542 0.4121448 0 0 var1 ## 5 547 478.47670 0.4634128 0 0 var1 ## 6 533 585.34058 0.5646933 0 0 var1
Plotting the semivariogram:
p <- ggplot(var, aes(x = dist, y = gamma, size = np)) + geom_point()
We can use the empirical semivariogram to fit a theoretical function describing the semivariance at each distance. There are several models (exponential, spherical, gaussian, etc.) and we can utilize the fit.variogram function to derive the best-fitting model:
var_fit <- fit.variogram(var, model = vgm(c("Exp", "Sph", "Gau")))
var_fit
## model psill range ## 1 Nug 0.05065971 0.0000 ## 2 Sph 0.59060511 897.0011
See https://cran.r-project.org/web/packages/gstat/vignettes/gstat.pdf for further information on semi-variograms!
See vgm() for all available models.
Adding the theoretical semivariogram to the plot:
vari_fit_line <- variogramLine(var_fit, maxdist = max(var$dist)) p <- ggplot(var, aes(x = dist, y = gamma)) + geom_point() + geom_line(data = vari_fit_line, aes(x = dist, y = gamma))
Kriging utilizes the theoretical semivariogram to interpolate values at any location based on distant-variance relationship. Before we can perform Kriging, we need to create a result-layer of points we want to interpolate:
xnew <- seq(meuse@bbox[1, 1], meuse@bbox[1, 2], length.out = 50) ynew <- seq(meuse@bbox[2, 1], meuse@bbox[2, 2], length.out = 50) gridnew <- expand.grid(xnew, ynew) head(gridnew)
## Var1 Var2 ## 1 178605.0 329714 ## 2 178661.8 329714 ## 3 178718.7 329714 ## 4 178775.5 329714 ## 5 178832.3 329714 ## 6 178889.2 329714
names(gridnew) <- c("x", "y")
coordinates(gridnew) <- ~x+y
proj4string(gridnew) <- CRS("+init=epsg:28992")
gridded(gridnew) <- TRUE
mapview(gridnew)
kriging <- krige(log(zinc) ~ 1, meuse, gridnew, model = var_fit)
## [using ordinary kriging]
kriging <- as.data.frame(kriging) p <- ggplot(kriging, aes(x = x, y = y)) + geom_point(aes(col = var1.pred)) + scale_color_gradient(low = "yellow", high = "red") + geom_point(data = as.data.frame(meuse@coords), aes(x = x, y = y)) + coord_equal()
We can smooth the figure by rasterizing the data:
p <- ggplot(kriging, aes(x = x, y = y)) + geom_tile(aes(fill = var1.pred)) + scale_fill_gradient(low = "yellow", high = "red") + geom_point(data = as.data.frame(meuse@coords), aes(x = x, y = y)) + coord_equal()
Using points we can define spatial neighborhoods based on their k nearest neighbors:
library(spdep) knn <- knearneigh(meuse@coords, k = 1) knn <- knn2nb(knn) knn
## Neighbour list object: ## Number of regions: 155 ## Number of nonzero links: 155 ## Percentage nonzero weights: 0.6451613 ## Average number of links: 1 ## Non-symmetric neighbours list
Maximum k of 1, 2, 3, 4:
Similarely, we can define neighborhoods based on a radial distance:
dnn <- dnearneigh(meuse, d1 = 0, d2 = 200) dnn
## Neighbour list object: ## Number of regions: 155 ## Number of nonzero links: 630 ## Percentage nonzero weights: 2.622268 ## Average number of links: 4.064516 ## 5 regions with no links: ## 30 106 108 148 155
Maximum distances of 200, 300, 400 and 500:
Similarely, we can calculate neighborhoods using polygons. Let's first load and plot a polygon of administrative units in Germany:
shp <- raster::shapefile("data/poverty_vietnam.shp")
#mapview(shp)
The data are similar to: https://www.ecologyandsociety.org/vol13/iss2/art24/
Neighborhoods with polygon are defined by touching edges and/or nodes:
nb <- poly2nb(shp, queen = TRUE) nb
## Neighbour list object: ## Number of regions: 614 ## Number of nonzero links: 3206 ## Percentage nonzero weights: 0.8504069 ## Average number of links: 5.221498 ## 7 regions with no links: ## 52 171 364 442 449 521 522
For more information on neighborhood definitions, see: https://cran.r-project.org/web/packages/spdep/vignettes/nb.pdf
For calculating Moran's I, we first need to define a weighted neighborhood file:
neighw <- nb2listw(nb, style = "W", zero.policy = TRUE)
The zero.policy arguments allows for units without neighbors. See ?nb2listw for weighting styles.
moransI <- moran(shp$FGT0, neighw, n = length(nb), S0 = Szero(neighw), zero.policy = TRUE) moran.plot(shp$FGT0, neighw, zero.policy = TRUE)
moran.mc(shp$FGT0, neighw, nsim = 999, zero.policy = TRUE)
## ## Monte-Carlo simulation of Moran I ## ## data: shp$FGT0 ## weights: neighw ## number of simulations + 1: 1000 ## ## statistic = 0.70361, observed rank = 1000, p-value = 0.001 ## alternative hypothesis: greater
localmoran <- localmoran(shp$FGT0, neighw, zero.policy = TRUE) head(localmoran)
## Ii E.Ii Var.Ii Z.Ii Pr(z > 0) ## 0 0.02505234 -0.001631321 0.16486395 0.06571772 0.4738013 ## 1 0.21025902 -0.001631321 0.33135311 0.36809985 0.3563994 ## 2 0.07007469 -0.001631321 0.09826829 0.22874355 0.4095341 ## 3 0.03436825 -0.001631321 0.16486395 0.08866139 0.4646755 ## 4 0.01907820 -0.001631321 0.16486395 0.05100435 0.4796610 ## 5 0.02708123 -0.001631321 0.14107979 0.07644329 0.4695332
shp@data$id <- rownames(shp@data)
shp@data$localmoran <- localmoran[, 1]
shp_df <- dplyr::left_join(fortify(shp, region = "id"), # convert to a data.frame
shp@data,
by = "id")
p <- ggplot(shp_df, aes(x = long, y = lat, group = group, fill = localmoran)) +
geom_polygon() +
scale_fill_gradient2(low = "blue", mid = "white", high = "red") +
coord_equal()